AITopics | Manchester

6a711a119a8a7a9f877b5f379bfe9ea2-Supplemental.pdf

Neural Information Processing Systems | Feb-9-2026, 05:15:04 GMT

equation, loss function, predictor, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Connecticut > Hartford County > Manchester (0.04)

Technology: Information Technology > Artificial Intelligence (0.70)

A Theorem Proofs

Neural Information Processing Systems | Aug-15-2025, 00:10:00 GMT

In this section, we present the proofs of the theorems introduced in the main paper. The proof of Theorem 2 is presented as follows. Consider a classification task where the loss function is the cross-entropy loss. This approximately holds for many applications with over-parameterized neural predictors. In this case, we have the following theorem: Theorem 3. If Equations (18) and (19) hold, that [...] This contradicts Equation (23).

equation, loss function, predictor, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Connecticut > Hartford County > Manchester (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models

Zhang, Siwei, Xiong, Yun, Tang, Yateng, Chen, Xi, Jia, Zian, Gu, Zehao, Xu, Jiarong, Zhang, Jiawei

arXiv.org Artificial Intelligence | Mar-18-2025

Temporal graph neural networks (TGNNs) have shown remarkable performance in temporal graph modeling. However, real-world temporal graphs often possess rich textual information, giving rise to temporal text-attributed graphs (TTAGs). This combination of dynamic text semantics and evolving graph structures introduces heightened complexity. Existing TGNNs embed texts statically and rely heavily on encoding mechanisms that disproportionately prioritize structural information, overlooking the temporal evolution of text semantics and the essential interplay between semantics and structures for synergistic reinforcement. To tackle these issues, we present Cross, a novel framework that seamlessly extends existing TGNNs for TTAG modeling. The key idea is to employ advanced large language models (LLMs) to extract the dynamic semantics in text space and then generate expressive representations unifying both semantics and structures. Specifically, we propose a Temporal Semantics Extractor in the Cross framework, which empowers the LLM to offer a temporal semantic understanding of the evolving contexts of a node's textual neighborhoods, facilitating semantic dynamics. Subsequently, we introduce the Semantic-structural Co-encoder, which collaborates with the above Extractor to synthesize illuminating representations by jointly considering both semantic and structural information while encouraging their mutual reinforcement. Extensive experimental results on four public datasets and one practical industrial dataset demonstrate Cross's significant effectiveness and robustness.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2503.14411

Country:

North America > United States > Connecticut > Hartford County > Manchester (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Yolo County > Davis (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Consumer Products & Services > Restaurants (0.68)
Information Technology > Services (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AILS-NTUA at SemEval-2025 Task 4: Parameter-Efficient Unlearning for Large Language Models using Data Chunking

Premptis, Iraklis, Lymperaiou, Maria, Filandrianos, Giorgos, Mastromichalakis, Orfeas Menis, Voulodimos, Athanasios, Stamou, Giorgos

arXiv.org Artificial Intelligence | Mar-4-2025

The Unlearning Sensitive Content from Large Language Models task aims to remove targeted datapoints from trained models while minimally affecting their general knowledge. In our work, we leverage parameter-efficient, gradient-based unlearning using low-rank adaptation (LoRA) and layer-focused fine-tuning. To further enhance unlearning effectiveness, we employ data chunking, splitting the forget data into disjoint partitions and merging them with cyclically sampled retain samples at a pre-defined ratio. Our task-agnostic method achieves an outstanding forget-retain balance, ranking first on leaderboards and significantly outperforming baselines and competing systems.
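The data chunking step described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `chunk_forget_with_retain`, the parameters `num_chunks` and `retain_per_forget`, and the dictionary layout of each chunk are all assumptions introduced here. The sketch splits the forget set into disjoint, near-equal partitions and fills each chunk with retain samples drawn cyclically, so the retain set wraps around when it is exhausted.

```python
from itertools import cycle, islice


def chunk_forget_with_retain(forget, retain, num_chunks, retain_per_forget):
    """Split `forget` into disjoint chunks and pair each with retain samples.

    Hypothetical helper: names and the fixed integer ratio are assumptions.
    Each chunk receives `retain_per_forget` retain samples per forget sample,
    drawn cyclically from `retain`.
    """
    if num_chunks <= 0:
        raise ValueError("num_chunks must be positive")

    # One shared cyclic iterator, so sampling continues where the
    # previous chunk stopped and wraps around at the end of `retain`.
    retain_iter = cycle(retain)

    # Near-equal partition sizes: the first `extra` chunks get one more item.
    base, extra = divmod(len(forget), num_chunks)

    chunks = []
    start = 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        part = forget[start:start + size]
        start += size
        n_retain = len(part) * retain_per_forget
        chunks.append({
            "forget": part,
            "retain": list(islice(retain_iter, n_retain)),
        })
    return chunks
```

For instance, calling it with five forget items, three retain items, two chunks, and a ratio of one retain sample per forget sample produces a first chunk with three forget items and all three retain items, and a second chunk whose retain samples start again from the beginning of the retain set.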

batch size, epoch, hyperparameter, (15 more...)

arXiv.org Artificial Intelligence

2503.02443

Country:

North America > United States > Kentucky > Jefferson County > Louisville (0.04)
Europe > Italy (0.04)
North America > United States > Massachusetts (0.04)
(14 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Privacy-Preserving Instructions for Aligning Large Language Models

Yu, Da, Kairouz, Peter, Oh, Sewoong, Xu, Zheng

arXiv.org Artificial Intelligence | Jul-2-2024

Service providers of large language model (LLM) applications collect user instructions in the wild and use them in further aligning LLMs with users' intentions. These instructions, which potentially contain sensitive information, are annotated by human workers in the process. This poses a new privacy risk not addressed by the typical private optimization. To this end, we propose using synthetic instructions to replace real instructions in data annotation and model fine-tuning. Formal differential privacy is guaranteed by generating those synthetic instructions using privately fine-tuned generators. Crucial in achieving the desired utility is our novel filtering algorithm that matches the distribution of the synthetic instructions to that of the real ones. In both supervised fine-tuning and reinforcement learning from human feedback, our extensive experiments demonstrate the high utility of the final set of synthetic instructions by showing comparable results to real instructions. In supervised fine-tuning, models trained with private synthetic instructions outperform leading open-source models such as Vicuna.

arxiv preprint arxiv, instruction, synthetic instruction, (13 more...)

arXiv.org Artificial Intelligence

2402.13659

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Maryland > Anne Arundel County > Odenton (0.04)
North America > United States > Connecticut > Hartford County > Manchester (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Understanding Interlocking Dynamics of Cooperative Rationalization

Yu, Mo, Zhang, Yang, Chang, Shiyu, Jaakkola, Tommi S.

arXiv.org Artificial Intelligence | Oct-26-2021

Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output. The selection mechanism is commonly integrated into the model itself by specifying a two-component cascaded system consisting of a rationale generator, which makes a binary selection of the input features (the rationale), and a predictor, which predicts the output based only on the selected features. The components are trained jointly to optimize prediction performance. In this paper, we reveal a major problem with this cooperative rationalization paradigm: model interlocking. Interlocking arises when the predictor overfits to the features selected by the generator, thus reinforcing the generator's selection even if the selected rationales are sub-optimal. The fundamental cause of the interlocking problem is that the rationalization objective to be minimized is concave with respect to the generator's selection policy. We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection. The generator now realizes both soft and hard attention over the features, and these are fed into the two different predictors. While the generator still seeks to support the original predictor's performance, it also minimizes a gap between the two predictors. As we will show theoretically, since the attention-based predictor exhibits a better convexity property, A2R can overcome the concavity barrier. Our experiments on two synthetic benchmarks and two real datasets demonstrate that A2R can significantly alleviate the interlocking problem and find explanations that better align with human judgments. We release our code at https://github.com/Gorov/Understanding_Interlocking.
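To make the distinction between hard selection and soft attention concrete, the toy sketch below contrasts the two ways of feeding features to a predictor. It is an illustration under strong assumptions and is not the A2R implementation from the linked repository: the top-k rule for hard selection, the linear predictors, the absolute-difference gap, and every function name here (`softmax`, `hard_select`, `linear_predict`, `prediction_gap`) are hypothetical choices made for brevity.

```python
import math


def softmax(scores):
    """Soft attention weights: positive values that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hard_select(scores, k):
    """Binary mask keeping the k highest-scoring features (assumed top-k rule)."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep = set(top)
    return [1.0 if i in keep else 0.0 for i in range(len(scores))]


def linear_predict(features, weights, mask):
    """Toy linear predictor that only sees features through the mask."""
    return sum(f * w * m for f, w, m in zip(features, weights, mask))


def prediction_gap(features, scores, w_hard, w_soft, k):
    """Absolute gap between the selection-based and attention-based predictors."""
    hard_out = linear_predict(features, w_hard, hard_select(scores, k))
    soft_out = linear_predict(features, w_soft, softmax(scores))
    return abs(hard_out - soft_out)
```

In this toy setting the generator is reduced to a fixed score vector; the hard predictor sees only the selected features, while the soft predictor sees every feature scaled by its attention weight. The gap term stands in for the quantity that the abstract says the generator minimizes between the two predictors.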

arxiv preprint arxiv, predictor, rationale, (15 more...)

arXiv.org Artificial Intelligence

2110.1388

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Connecticut > Hartford County > Manchester (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
